
    Salient Region Segmentation

    Saliency prediction is a well-studied problem in computer vision. Early saliency models were based on low-level hand-crafted features derived from insights gained in neuroscience and psychophysics. In the wake of the deep learning breakthrough, a new cohort of models was proposed based on neural network architectures, achieving significantly higher gaze-prediction accuracy than previous shallow models on all metrics. However, most models treat saliency prediction as a regression problem, and accurate regression of high-dimensional data is known to be a hard problem. Furthermore, it is unclear whether intermediate levels of saliency (i.e., neither very high nor very low) are meaningful: something is either salient, or it is not. Drawing on these two observations, we reformulate the saliency prediction problem as a salient region segmentation problem. We demonstrate that this reformulation allows for faster convergence than the classical regression problem, while performance is comparable to the state of the art. We also visualise the general features learned by the model, which are shown to be consistent with insights from psychophysics. Engineering and Physical Sciences Research Council (EPSRC)
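
    The reformulation can be pictured with a minimal sketch (the 0.5 threshold and the toy saliency maps below are illustrative assumptions, not values from the paper): a continuous ground-truth saliency map is binarised into a salient/not-salient mask, and a per-pixel binary cross-entropy replaces the regression objective.

```python
import numpy as np

def saliency_to_mask(saliency_map, threshold=0.5):
    # A pixel is either salient (1) or not (0); intermediate saliency
    # levels are discarded, per the segmentation reformulation.
    return (saliency_map >= threshold).astype(np.float32)

def segmentation_loss(pred, mask, eps=1e-7):
    # Per-pixel binary cross-entropy, replacing the regression objective.
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(mask * np.log(pred) + (1 - mask) * np.log(1 - pred)))

# Toy 2x2 saliency map: two clearly salient pixels, two clearly not.
gt_saliency = np.array([[0.9, 0.1],
                        [0.8, 0.2]])
mask = saliency_to_mask(gt_saliency)   # -> [[1, 0], [1, 0]]
loss = segmentation_loss(np.array([[0.9, 0.1],
                                   [0.8, 0.2]]), mask)
```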

    Continuous Control with a Combination of Supervised and Reinforcement Learning

    This is the author accepted manuscript. The final version is available from the Institute of Electrical and Electronics Engineers via the DOI in this record. Reinforcement learning methods have recently achieved impressive results on a wide range of control problems. However, especially with complex inputs, they still require an extensive amount of training data in order to converge to a meaningful solution. This limits their applicability to complex input spaces such as video signals, and makes them impractical for use in complex real-world problems, including many of those for video-based control. Supervised learning, on the contrary, is capable of learning from a relatively limited number of samples, but relies on arbitrary hand-labelling of data rather than task-derived reward functions, and hence does not yield independent control policies. In this article we propose a novel, model-free approach, which uses a combination of reinforcement and supervised learning for autonomous control and paves the way towards policy-based control in real-world environments. We use the SpeedDreams/TORCS video game to demonstrate that our approach requires far fewer samples (hundreds of thousands versus millions or tens of millions) compared to state-of-the-art reinforcement learning techniques on similar data, while at the same time outperforming both supervised and reinforcement learning approaches in terms of quality. Additionally, we demonstrate the applicability of the method to MuJoCo control problems. The authors are grateful for the support of the UK Engineering and Physical Sciences Research Council (EPSRC) project DEVA EP/N035399/1.
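
    One way to picture such a combination is a training loss that mixes an imitation term with a reward-weighted term. The equal weighting and the simplified reinforcement term below are illustrative assumptions; the paper's actual model-free formulation is more involved.

```python
import numpy as np

def combined_loss(pred_action, demo_action, advantage,
                  w_sup=0.5, w_rl=0.5):
    pred = np.asarray(pred_action)
    # Supervised term: imitate the demonstrated (labelled) action.
    sup = float(np.mean((pred - np.asarray(demo_action)) ** 2))
    # Reinforcement term: an advantage-weighted score standing in for
    # a policy-gradient objective derived from the task reward.
    rl = float(-advantage * np.mean(pred))
    return w_sup * sup + w_rl * rl

loss = combined_loss(pred_action=[1.0], demo_action=[0.0], advantage=2.0)
```

With a positive advantage the reinforcement term rewards the taken action even when it deviates from the demonstration, which is the tension the combined objective balances.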

    On-Policy Trust Region Policy Optimisation with Replay Buffers

    Building upon the recent success of deep reinforcement learning methods, we investigate the possibility of improving on-policy reinforcement learning by reusing the data from several consecutive policies. On-policy methods bring many benefits, such as the ability to evaluate each resulting policy. However, they usually discard all information about the policies that existed before. In this work, we propose an adaptation of the replay buffer concept, borrowed from the off-policy learning setting, to create a method combining the advantages of on- and off-policy learning. To achieve this, the proposed algorithm generalises the Q-, value, and advantage functions for data from multiple policies. The method uses trust region optimisation while avoiding some common problems of algorithms such as TRPO or ACKTR: it uses hyperparameters to replace the trust region selection heuristics, as well as a trainable covariance matrix instead of a fixed one. In many cases, the method not only improves results compared to state-of-the-art trust region on-policy learning algorithms such as PPO, ACKTR and TRPO, but also with respect to their off-policy counterpart DDPG. Engineering and Physical Sciences Research Council (EPSRC)
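
    The replay-buffer adaptation can be sketched as a buffer that retains trajectories only from the last few consecutive policies; the `n_policies` window and dict-shaped transitions below are illustrative assumptions, not the paper's implementation.

```python
from collections import deque

class MultiPolicyReplayBuffer:
    # Keeps trajectories from the last `n_policies` consecutive
    # policies; data from older policies is dropped automatically.
    def __init__(self, n_policies=3):
        self.buffers = deque(maxlen=n_policies)

    def start_policy(self):
        self.buffers.append([])      # fresh store for the newest policy

    def add(self, transition):
        self.buffers[-1].append(transition)

    def sample_all(self):
        # Data from several consecutive policies, used jointly to
        # estimate the generalised value and advantage functions.
        return [t for buf in self.buffers for t in buf]

buf = MultiPolicyReplayBuffer(n_policies=3)
for policy_id in range(4):
    buf.start_policy()
    buf.add({"policy": policy_id, "reward": 1.0})
```

After four policy updates, only data from the three most recent policies remains, so updates stay close to on-policy while reusing slightly stale samples.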

    Deep saliency: What is learnt by a deep network about saliency?

    2nd Workshop on Visualization for Deep Learning. This is the author accepted manuscript. The final version is available from the International Machine Learning Society via the URL in this record. Deep convolutional neural networks have achieved impressive performance on a broad range of problems, beating prior art on established benchmarks, but it often remains unclear what representations are learnt by those systems and how they achieve such performance. This article examines the specific problem of saliency detection, where benchmarks are currently dominated by CNN-based approaches, and investigates the properties of the learnt representation by visualizing the artificial neurons’ receptive fields. We demonstrate that fine-tuning a pre-trained network on the saliency detection task leads to a profound transformation of the network’s deeper layers. Moreover, we argue that this transformation leads to the emergence of receptive fields conceptually similar to the centre-surround filters hypothesized by early research on visual saliency. This work was supported by the EPSRC project DEVA EP/N035399/1.
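
    The visualisation idea can be sketched with activation maximisation on a single linear "neuron": gradient ascent on the input under a unit-norm constraint recovers the neuron's preferred stimulus, i.e. its receptive field. The 1-D centre-surround weight profile below is an illustrative assumption, not a filter taken from the paper's network.

```python
import numpy as np

def preferred_stimulus(w, n_steps=200, lr=0.1, seed=0):
    # Gradient ascent on the input of a linear neuron (activation = w.x)
    # under a unit-norm constraint; the optimum is the receptive field.
    x = np.random.default_rng(seed).standard_normal(w.shape)
    for _ in range(n_steps):
        x = x + lr * w               # d(activation)/dx = w
        x = x / np.linalg.norm(x)    # project back onto the unit sphere
    return x

# A 1-D centre-surround weight profile, echoing the filters the
# fine-tuned network is argued to develop.
w = np.array([-1.0, 2.0, -1.0])
x_star = preferred_stimulus(w)
```

For a linear neuron the result simply converges to the (normalised) weight vector; for a real deep network the same ascent is run through the full stack of layers.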

    Contextual relabelling of detected objects

    This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record. Contextual information, such as the co-occurrence of objects and the spatial and relative sizes among objects, provides deep and complex information about scenes. It can also play an important role in improving object detection. In this work, we present two contextual models (a rescoring model and a relabelling model) that leverage contextual information (16 contextual relationships are applied in this paper) to enhance state-of-the-art R-CNN-based object detection (Faster R-CNN). We experimentally demonstrate that our models lead to an enhancement in detection performance on the most commonly used dataset in this field (MSCOCO).
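
    A rescoring model of this general kind can be sketched as follows; the co-occurrence values and the max-over-context rule below are illustrative assumptions, not the paper's learnt statistics or its 16 relationships.

```python
# Hypothetical co-occurrence priors, standing in for statistics that
# would be learnt from training annotations (not MSCOCO's real values).
co_occurrence = {
    ("person", "bicycle"): 0.8,
    ("person", "toaster"): 0.1,
}

def rescore(detections):
    # Rescale each detection's confidence by how plausible its label is
    # given the other labels found in the same image.
    rescored = []
    for i, (label, score) in enumerate(detections):
        context = [l for j, (l, _) in enumerate(detections) if j != i]
        prior = max((co_occurrence.get((label, c),
                                       co_occurrence.get((c, label), 0.5))
                     for c in context), default=1.0)
        rescored.append((label, score * prior))
    return rescored
```

A detection with no context keeps its original score, while mutually plausible pairs (person + bicycle) are boosted relative to implausible ones.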

    How much of driving is pre-attentive?

    Driving a car in an urban setting is an extremely difficult problem, incorporating a large number of complex visual tasks; however, this problem is solved daily by most adults with little apparent effort. This paper proposes a novel vision-based approach to autonomous driving that can predict and even anticipate a driver's behavior in real time, using pre-attentive vision only. Experiments on three large datasets totaling over 200,000 frames show that our pre-attentive model can (1) detect a wide range of driving-critical context such as crossroads, city center, and road type; more surprisingly, it can (2) detect the driver's actions (over 80% of braking and turning actions) and (3) estimate the driver's steering angle accurately. Additionally, our model is consistent with human data: First, the best steering prediction is obtained for a perception-to-action delay consistent with psychological experiments. Importantly, this prediction can be made before the driver's action. Second, the regions of the visual field used by the computational model strongly correlate with the driver's gaze locations, significantly outperforming many saliency measures and comparable to state-of-the-art approaches. European Commission’s Seventh Framework Programme (FP7/2007-2013)

    Training-ValueNet: data-driven label noise cleaning on weakly-supervised web images

    This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record. Manually labeling new datasets for image classification remains expensive and time-consuming. A promising alternative is to utilize the abundance of images on the web, for which search queries or surrounding text offer a natural source of weak supervision. Unfortunately, the label noise in these datasets has limited their use in practice. Several methods have been proposed for performing unsupervised label noise cleaning, the majority of which use outlier detection to identify and remove mislabeled images. In this paper, we argue that outlier detection is an inherently unsuitable approach for this task due to major flaws in the assumptions it makes about the distribution of mislabeled images. We propose an alternative approach which makes no such assumptions. Rather than looking for outliers, we observe that mislabeled images can be identified by the detrimental impact they have on the performance of an image classifier. We introduce training-value as an objective measure of the contribution each training example makes to the validation loss. We then present the training-value approximation network (Training-ValueNet), which learns a mapping between each image and its training-value. We demonstrate that by simply discarding images with a negative training-value, Training-ValueNet is able to significantly improve classification performance on a held-out test set, outperforming the state of the art in outlier detection by a large margin.
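
    The cleaning step itself is simple once training-values are available: discard every example whose value is negative. The `toy_training_value` function below is a hypothetical stand-in for the learnt Training-ValueNet, and the filenames and labels are made up for illustration.

```python
def clean_dataset(examples, training_value):
    # Keep only examples whose estimated training-value is positive,
    # i.e. those that on balance reduce the validation loss.
    return [x for x in examples if training_value(x) > 0]

# Toy data: filenames that match their label are 'helpful'; the
# mismatched example simulates a mislabeled web image.
data = [("cat_001.jpg", "cat"), ("dog_001.jpg", "dog"),
        ("car_001.jpg", "cat")]

def toy_training_value(example):
    filename, label = example
    return 1.0 if filename.startswith(label) else -1.0

cleaned = clean_dataset(data, toy_training_value)
```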

    Sign Spotting using Hierarchical Sequential Patterns with Temporal Intervals

    This paper tackles the problem of spotting a set of signs occurring in videos containing sequences of signs. To achieve this, we propose to model the spatio-temporal signature of a sign using an extension of sequential patterns that contain temporal intervals, called Sequential Interval Patterns (SIPs). We then propose a novel multi-class classifier that organises different sequential interval patterns in a hierarchical tree structure called a Hierarchical SIP Tree (HSP-Tree). This allows one to exploit any subsequence sharing that exists between SIPs of different classes. Multiple trees are then combined into a forest of HSP-Trees, resulting in a strong classifier that can be used to spot signs. We then show how the HSP-Forest can be used to spot sequences of signs that occur in an input video. We have evaluated the method on both concatenated sequences of isolated signs and continuous sign sequences. We also show that the proposed method is superior in robustness and accuracy to a state-of-the-art sign recogniser when applied to spotting a sequence of signs. This work was funded by the UK government.
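
    A single Sequential Interval Pattern can be sketched as a chain of symbols with bounded temporal gaps between them. The greedy matcher and the hand-made event names below are illustrative assumptions and do not reproduce the paper's HSP-Tree classifier.

```python
def matches_sip(events, pattern):
    # events: list of (symbol, time); pattern: [first_symbol,
    # (gap_min, gap_max, next_symbol), ...]. Each subsequent symbol
    # must occur within the given temporal interval after the
    # previously matched event. Greedy first-match sketch only.
    for start in [t for s, t in events if s == pattern[0]]:
        t, ok = start, True
        for gap_min, gap_max, sym in pattern[1:]:
            nxt = [tt for s, tt in events
                   if s == sym and gap_min <= tt - t <= gap_max]
            if not nxt:
                ok = False
                break
            t = nxt[0]
        if ok:
            return True
    return False

events = [("hand_up", 0), ("hand_right", 2), ("hand_down", 5)]
```

The temporal intervals are what distinguish a SIP from a plain sequential pattern: the same symbols in the same order fail to match if their gaps fall outside the allowed ranges.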